Applications in Computer Vision
Algorithm 11 BiRe-ID Training
Input: The training dataset, and hyper-parameters such as the initial learning rate, weight decay, convolution stride, and padding size.
Output: The BiRe-ID model with binarized weights b_w, learnable scale factors α, and other parameters p.
1: Initialize w, α, p, and W_D randomly;
2: repeat
3:    Randomly sample a mini-batch from the dataset;
4:    // Forward propagation
5:    for i = 1 to N convolution layers do
6:       b_{a_i} = sign(Φ(α_i ◦ b_{a_{i-1}} ⊙ b_{w_i}));
7:    end for
8:    // Backward propagation
9:    for i = L to 1 do
10:      Update the kernel refining discriminator D(·) of the GAN by ascending its stochastic gradient:
11:         ∇_D(log(D(w_i; W_D)) + log(1 − D(b_{w_i} ◦ α_i; W_D)));
12:      Update the feature refining discriminator D(·) of the GAN by ascending its stochastic gradient:
13:         ∇_D(log(D(a*_H; W_D)) + log(1 − D(a_L; W_D)));
14:      Calculate the gradient δ_{w_i}; // Using Eqs. 7-12
15:      w_i ← w_i − η_1 δ_{w_i}; // Update the weights
16:      Calculate the gradient δ_{α_i}; // Using Eqs. 13-16
17:      α_i ← α_i − η_2 δ_{α_i}; // Update the scale factors
18:      Calculate the gradient δ_{p_i}; // Using Eqs. 13-16
19:      p_i ← p_i − η_3 δ_{p_i}; // Update the other parameters
20:   end for
21: until the maximum epoch is reached
22: b_w = sign(w).
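The per-layer forward step (line 6 of Algorithm 11) can be sketched as follows. This is a minimal NumPy illustration only: the convolution is approximated by a matrix product (a 1×1-conv stand-in) and tanh stands in for the nonlinearity Φ, both of which are assumptions rather than the exact operators used by BiRe-ID.

```python
import numpy as np

def sign_binarize(x):
    # Binarize to {-1, +1}; zero is mapped to +1, as is common in BNNs.
    return np.where(x >= 0, 1.0, -1.0)

def binarized_conv_step(a_prev_bin, w, alpha, phi=np.tanh):
    """One forward step: b_a = sign(Phi(alpha ∘ (b_{a-1} ⊙ b_w))).

    a_prev_bin: binarized activations from the previous layer, shape (batch, in).
    w:          real-valued weights, shape (in, out); binarized before use.
    alpha:      per-channel learnable scale factors, shape (out,).
    phi:        stand-in for the nonlinearity Phi (assumed tanh here).
    """
    b_w = sign_binarize(w)        # binarize the weights
    z = a_prev_bin @ b_w          # XNOR-style conv approximated by a matmul
    z = phi(alpha * z)            # channel-wise scale, then nonlinearity
    return sign_binarize(z)       # binarize activations for the next layer
```

Because both operands of the product are in {-1, +1}, the matmul could be replaced by XNOR and popcount operations on real binary hardware; the scale factors α recover some of the dynamic range lost to binarization.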
6.2.5 Ablation Study
In this section, we conduct a performance study of the components of BiRe-ID, including the kernel MSE loss (with hyperparameter λ), KR-GAL, the feature MSE loss (with hyperparameter μ), and FR-GAL. Market-1501 [289] and ResNet-18 are used in this experiment. We separate this subsection into two parts: selecting the hyperparameters and evaluating the components of BiRe-ID.
Selecting Hyper-Parameters We first keep the kernel refining GAL (KR-GAL) and the feature refining GAL (FR-GAL) fixed to compare the impact of the hyperparameters λ and μ on the ResNet-18 backbone. As plotted in Fig. 6.2, we vary λ from 0 to 1e−4 and μ from 0 to 1e−2 and evaluate BiRe-ID's mAP under each setting. From bottom to top, BiRe-ID obtains clearly better mAPs with μ set to 5e−3 (green mAP curve). From left to right, BiRe-ID obtains the best mAP with λ set to 5e−5. Therefore, we set μ to 5e−3 and λ to 5e−5 in the experiments on the Re-ID task.
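The selection procedure above amounts to a small grid search over (λ, μ). A minimal sketch, assuming a hypothetical evaluate_map(lam, mu) callback that trains the model with those loss weights and returns its mAP (the grid values mirror the ranges reported in the ablation):

```python
# Candidate loss weights, spanning the ranges studied in the ablation:
# lambda in [0, 1e-4] and mu in [0, 1e-2].
LAMBDAS = [0.0, 1e-5, 5e-5, 1e-4]
MUS = [0.0, 1e-3, 5e-3, 1e-2]

def select_hyperparams(evaluate_map):
    """Return the (lambda, mu) pair that maximizes mAP.

    evaluate_map: hypothetical callback (lam, mu) -> mAP; in practice this
    would train BiRe-ID with those weights and evaluate on Market-1501.
    """
    return max(((lam, mu) for lam in LAMBDAS for mu in MUS),
               key=lambda pair: evaluate_map(*pair))
```

In practice each grid point is a full training run, so the grid is kept coarse; the two weights are searched jointly because the kernel and feature losses interact through the shared backbone.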
Evaluating the Components of BiRe-ID As shown in Table 6.5, the use of GANs dramatically improves the performance of the proposed baseline network. More specifically, we first build our baseline network by adding a single BN layer ahead of the 1-bit convolutions of XNOR-Net, which brings a 14.1% improvement in mAP. Introducing KR-GAL and FR-GAL improves mAP by 7.1% and 4.1%, respectively, on the proposed